Resource Constrained Multimedia Event Detection
Abstract
We present a study comparing the cost and efficiency trade-offs of multiple features for multimedia event detection. Low-level and semantic features are a critical part of contemporary multimedia and computer vision research. Arguably, combinations of multiple feature sets have been a major reason for recent progress in the field, not just as low-dimensional representations of multimedia data, but also as a means to semantically summarize images and videos. However, their efficacy for complex event recognition in unconstrained videos on standardized datasets has not been systematically studied. In this paper, we evaluate the accuracy and contribution of more than 10 multi-modality features, including semantic and low-level video representations, using two newly released NIST TRECVID Multimedia Event Detection (MED) open source datasets, MEDTEST and KINDREDTEST, which contain more than 1000 hours of video. Contrasting multiple performance metrics, such as average precision, probability of missed detection, and minimum normalized detection cost, we propose a framework to balance the trade-off between accuracy and computational cost. This study provides an empirical foundation for selecting feature sets that can handle large-scale data with limited computational resources and are likely to produce superior multimedia event detection accuracy. The framework also applies to other resource-limited multimedia analysis tasks, such as selecting or fusing multiple classifiers and different representations of each feature set.
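To make the accuracy-versus-cost trade-off concrete, the following is a minimal sketch and not the framework proposed in the paper. It computes average precision for a ranked list of videos, then picks features greedily by accuracy per unit of cost until a compute budget is used up. The function names, the greedy heuristic, and all feature names and numbers are assumptions chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision of a ranked list (label 1 = target event, 0 = other)."""
    # Rank videos from highest to lowest detection score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def select_features(
    candidates: Dict[str, Tuple[float, float]], budget: float
) -> List[str]:
    """Greedy selection under a cost budget (illustrative heuristic only).

    candidates maps a hypothetical feature name to (accuracy, cost);
    cost must be positive, in any consistent unit such as CPU hours.
    """
    # Prefer features that deliver the most accuracy per unit of cost.
    order = sorted(
        candidates,
        key=lambda name: candidates[name][0] / candidates[name][1],
        reverse=True,
    )
    chosen: List[str] = []
    spent = 0.0
    for name in order:
        cost = candidates[name][1]
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen


if __name__ == "__main__":
    # Made-up scores and labels for four videos.
    print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
    # Made-up (accuracy, cost) pairs for three hypothetical features.
    features = {"a": (0.30, 10.0), "b": (0.20, 2.0), "c": (0.25, 5.0)}
    print(select_features(features, budget=8.0))
```

Ranking features independently ignores how much they overlap once fused, so a realistic selection would score combinations of features and not single features in isolation.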
Similar resources
Detecting communities of workforces for the multi-skill resource-constrained project scheduling problem: A dandelion solution approach
This paper proposes a new mixed-integer model for the multi-skill resource-constrained project scheduling problem (MSRCPSP). The interactions between workers are represented as undirected networks. Therefore, for each required skill, an undirected network is formed which shows the relations of human resources. In this paper, community detection in networks is used to find the most compatible wo...
Hybrid Layered Video Encoding for Mobile Internet-Based Computer Vision and Multimedia Applications
Mobile networked environments are typically resource constrained in terms of the available bandwidth and battery capacity on mobile devices. Realtime video applications entail the analysis, storage, transmission, and rendering of video data, and are hence resource-intensive. Since the available bandwidth in the mobile Internet is constantly changing, and the battery life of a mobile video appli...
A Hybrid Layered Video Encoding Technique for Mobile Internet-based Vision
The increasing deployment of broadband networks and simultaneous proliferation of low-cost video capturing and multimedia-enabled mobile devices have triggered a new wave of mobile Internet-based computer vision applications. However, mobile networked environments are typically resource constrained in terms of the available bandwidth and battery capacity on mobile devices. Computer vision appli...
Decentralized algorithms for classifier topology optimization in large-scale multi-concept detection
Multi-concept identification in high volume multimedia streams is critical for a number of applications, including large-scale multimedia analysis, processing, and retrieval. Content of interest is filtered using a collection of binary classifiers that are deployed on distributed resource-constrained infrastructure. In this paper, we design distributed algorithms for determining the optimal top...
Audio self organized units for high-level event detection
High-level multimedia event detection aims to identify videos containing a target event. Recent approaches leveraging audio information for this task fall into two broad categories. The first corresponds to holistic bag-of-words approaches based on frame-level descriptors. These are effective for classification, but hard for humans to interpret. The second corresponds to approaches that build a...